A large database DNA sequence handling program with generalized searching specifications
Abstract
The program described allows the creation and manipulation of files of DNA sequence data of very great length. It uses its own paging system to load segments of the sequence into a small internal buffer, so that it does not have excessive memory requirements. The program offers the user a menu of functions and has been written to be forgiving of user errors. A code for the generalised specification of bases as a series of groups (e.g. A or T, or purine) has been devised and can be used in search specifications or in sequence files. Versions of the program have been developed to run with special efficiency under DIGITAL's RT-11 operating system or to run under systems with a suitable implementation of FORTRAN IV.
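The two ideas in the abstract, group codes for bases and a small paged buffer, can be illustrated with a minimal sketch. It is not the program's own method: the abstract does not spell out the devised code, so IUPAC-style letters are assumed here as a stand-in, and the names `GROUPS`, `matches`, and `paged_search` are hypothetical.

```python
import io

# Assumption: IUPAC-style group codes stand in for the paper's own code,
# which the abstract does not spell out.
GROUPS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "W": "AT",    # A or T
    "S": "CG",    # C or G
    "N": "ACGT",  # any base
}

def matches(pattern, window):
    """True if each base in window belongs to the group at that position."""
    return len(pattern) == len(window) and all(
        base in GROUPS[code] for code, base in zip(pattern, window)
    )

def paged_search(stream, pattern, page_size=8):
    """Yield 0-based match positions while holding only a small buffer."""
    keep = len(pattern) - 1   # tail carried over so matches spanning pages are found
    buffer, offset = "", 0    # offset is the sequence position of buffer[0]
    while True:
        page = stream.read(page_size)
        if not page:
            break
        buffer += page
        # Test every start position that has a full-length window available.
        for i in range(len(buffer) - keep):
            if matches(pattern, buffer[i:i + len(pattern)]):
                yield offset + i
        # Discard the tested start positions, keeping only the untested tail.
        drop = max(len(buffer) - keep, 0)
        buffer, offset = buffer[drop:], offset + drop
```

For example, `list(paged_search(io.StringIO("ACGTAGGATCCA"), "RGW", page_size=3))` returns `[5]`: the match `GGA` straddles two three-base pages and is still found, because the last `len(pattern) - 1` bases of each page are carried into the next.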
Similar resources
Processing and population genetic analysis of multigenic datasets with ProSeq3 software
MOTIVATION The current tendency in molecular population genetics is to use increasing numbers of genes in the analysis. Here I describe a program for handling and population genetic analysis of DNA polymorphism data collected from multiple genes. The program includes a sequence/alignment editor and an internal relational database that simplify the preparation and manipulation of multigenic DNA ...
SubtiList: the reference database for the Bacillus subtilis genome
SubtiList is the reference database dedicated to the genome of Bacillus subtilis 168, the paradigm of Gram-positive endospore-forming bacteria. Developed in the framework of the B.subtilis genome project, SubtiList provides a curated dataset of DNA and protein sequences, combined with the relevant annotations and functional assignments. Information about gene functions and products is continuou...
Indexing Strategies for Rapid Searches of Short Words in Genome Sequences
Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy...
SubtiList: a relational database for the Bacillus subtilis genome.
In the framework of the international collaborative project aiming to sequence the whole Bacillus subtilis chromosome, we have created a relational database for managing and analysing information associated with the molecular genetics of this bacterium: SubtiList. It allows recovery of non-redundant DNA sequences of the B. subtilis genome, as well as related information, i.e. genes, proteins, e...
Mining patterns and rules for software specification discovery
Software specifications are often lacking, incomplete and outdated in the industry. Lack and incomplete specifications cause various software engineering problems. Studies have shown that program comprehension takes up to 45% of software development costs. One of the root causes of the high cost is the lack-of documented specification. Also, outdated and incomplete specification might potential...
Journal title:
- Nucleic acids research
Volume 10, Issue 1
Publication date: 1982